Super-resolution from handheld camera

ABSTRACT

An apparatus and method for increasing the resolution of an image are provided. The method includes capturing a plurality of frames of an image, determining a reference frame from among the plurality of frames, iteratively determining an offset of each of the plurality of frames to the reference frame until unity scaling is reached, and determining a pixel value for insertion between pixels of the reference frame.

CROSS REFERENCE TO RELATED APPLICATION(S)

This application claims the benefit under 35 USC §119(e) of U.S. Provisional Application No. 62/011,311, filed Jun. 12, 2014, the entire disclosure of which is hereby incorporated by reference.

TECHNICAL FIELD

The present disclosure relates to an apparatus and method for providing an image. More particularly, the present disclosure relates to an apparatus and method for providing an image having increased resolution.

BACKGROUND

Mobile terminals were first developed to provide wireless communication between users. As technology has advanced, mobile terminals now provide many additional features beyond the simple telephone conversation. For example, mobile terminals are now able to provide advanced functions such as an alarm, a Short Messaging Service (SMS), a Multimedia Messaging Service (MMS), E-mail, games, short range communication, an image capturing function using a mounted digital camera, a multimedia function for providing audio and video content, a scheduling function, and many more. With the plurality of features now provided, a mobile terminal has effectively become a necessity of daily life for most people.

As is known in the art, an image may be captured by a digital camera mounted on the mobile terminal. For example, when a user selects an image capturing function, a Graphical User Interface (GUI) may be displayed, allowing the user to select a capturing button of the GUI to ultimately capture a desired image. When capturing an image, an image sensor of the digital camera is controlled to receive information on a plurality of photosites. However, the number of photosites may be limited and thus provide an image having a low resolution. Accordingly, there is a need for an improved apparatus and method for providing an image having super-resolution using an existing image sensor.

The above information is presented as background information only to assist with an understanding of the present disclosure. No determination has been made, and no assertion is made, as to whether any of the above might be applicable as prior art with regard to the present disclosure.

SUMMARY

Aspects of the present disclosure are to address at least the above-mentioned problems and/or disadvantages and to provide at least the advantages described below. Accordingly, an aspect of the present disclosure is to provide an apparatus and method for providing an image having super-resolution.

In accordance with an aspect of the present disclosure, a method for increasing the resolution of an image is provided. The method includes capturing a plurality of frames of an image, determining a reference frame from among the plurality of frames, iteratively determining an offset of each of the plurality of frames to the reference frame until unity scaling is reached, and determining a pixel value for insertion between pixels of the reference frame.

In accordance with another aspect of the present disclosure, an apparatus for increasing the resolution of an image is provided. The apparatus includes a camera unit configured to capture a plurality of frames of an image, and a control unit configured to determine a reference frame from among the plurality of frames, to iteratively determine an offset of each of the plurality of frames to the reference frame until unity scaling is reached, and to determine a pixel value for insertion between pixels of the reference frame.

Other aspects, advantages, and salient features of the disclosure will become apparent to those skilled in the art from the following detailed description, which, taken in conjunction with the annexed drawings, discloses various embodiments of the disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other aspects, features, and advantages of various embodiments of the present disclosure will be more apparent from the following description taken in conjunction with the accompanying drawings, in which:

FIGS. 1A-1C illustrate the motion of a camera as a user attempts to take a still shot of a stationary object according to an embodiment of the present invention;

FIGS. 2A-2D illustrate a method of improving the resolution of an image for display by overlapping frames of a captured image according to an embodiment of the present disclosure;

FIGS. 3A-3C illustrate a process of capturing and aligning two low resolution images according to an embodiment of the present disclosure;

FIG. 4 illustrates a progressive approximation for aligning image captures according to an embodiment of the present disclosure;

FIGS. 5A and 5B illustrate two iterations of a progressive approximation method according to an embodiment of the present disclosure;

FIG. 6 illustrates a result of four iterations of a progressive approximation according to an embodiment of the present disclosure;

FIGS. 7A and 7B illustrate an eighth iteration and ninth iteration to determine fractional pixel offsets according to an embodiment of the present disclosure;

FIGS. 8A-8C illustrate two image captures requiring additional morphing techniques according to an embodiment of the present disclosure;

FIGS. 9A-9C illustrate a hybrid global and local registration algorithm for linearly distorted frames according to an embodiment of the present disclosure;

FIG. 10 illustrates a progressive approximation for aligning image captures according to another embodiment of the present disclosure;

FIG. 11 is a flowchart illustrating a method for performing a progressive approximation for aligning image captures according to an embodiment of the present disclosure;

FIG. 12 is a block diagram schematically illustrating a configuration of an electronic device according to an embodiment of the present disclosure; and

FIG. 13 is a block diagram of an applications processor configured to perform a progressive approximation for a super-resolution image according to an embodiment of the present disclosure.

Throughout the drawings, it should be noted that like reference numbers are used to depict the same or similar elements, features, and structures.

DETAILED DESCRIPTION

Detailed descriptions of various aspects of the present disclosure will be discussed below with reference to the attached drawings. The descriptions are set forth as examples only, and shall not limit the scope of the present disclosure.

The detailed description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of various embodiments of the disclosure as defined by the claims and their equivalents. It includes various specific details to assist in that understanding but these are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the disclosure. In addition, descriptions of well-known functions and constructions are omitted for clarity and conciseness.

The terms and words used in the following description and claims are not limited to the bibliographical meanings, but are merely used by the inventor to enable a clear and consistent understanding of the disclosure. Accordingly, it should be apparent to those skilled in the art that the following description of various embodiments of the present disclosure is provided for illustration purposes only and not for the purpose of limiting the disclosure as defined by the appended claims and their equivalents.

It is to be understood that the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a component surface” includes reference to one or more of such surfaces.

By the term “substantially” it is meant that the recited characteristic, parameter, or value need not be achieved exactly, but that deviations or variations, including for example, tolerances, measurement error, measurement accuracy limitations and other factors known to those of skill in the art, may occur in amounts that do not preclude the effect the characteristic was intended to provide.

Unless defined differently, all terms used in the present disclosure, including technical or scientific terms, have meanings that are understood generally by a person having ordinary skill in the art. Ordinary terms that may be defined in a dictionary should be understood to have the meaning consistent with their context, and unless clearly defined in the present disclosure, should not be interpreted to be excessively idealistic or formalistic.

According to various embodiments of the present disclosure, an electronic device may include communication functionality. For example, an electronic device may be a smart phone, a tablet Personal Computer (PC), a mobile phone, a video phone, an e-book reader, a desktop PC, a laptop PC, a netbook PC, a Personal Digital Assistant (PDA), a Portable Multimedia Player (PMP), an MP3 player, a mobile medical device, a camera, a wearable device (e.g., a Head-Mounted Device (HMD), electronic clothes, electronic braces, an electronic necklace, an electronic appcessory, an electronic tattoo, or a smart watch), and/or the like.

According to various embodiments of the present disclosure, an electronic device may be a smart home appliance with communication functionality. A smart home appliance may be, for example, a television, a Digital Video Disk (DVD) player, an audio, a refrigerator, an air conditioner, a vacuum cleaner, an oven, a microwave oven, a washer, a dryer, an air purifier, a set-top box, a TV box (e.g., Samsung HomeSync™, Apple TV™, or Google TV™), a gaming console, an electronic dictionary, an electronic key, a camcorder, an electronic picture frame, and/or the like.

According to various embodiments of the present disclosure, an electronic device may be a medical device (e.g., Magnetic Resonance Angiography (MRA) device, a Magnetic Resonance Imaging (MRI) device, Computed Tomography (CT) device, an imaging device, or an ultrasonic device), a navigation device, a Global Positioning System (GPS) receiver, an Event Data Recorder (EDR), a Flight Data Recorder (FDR), an automotive infotainment device, a naval electronic device (e.g., naval navigation device, gyroscope, or compass), an avionic electronic device, a security device, an industrial or consumer robot, and/or the like.

According to various embodiments of the present disclosure, an electronic device may be furniture, part of a building/structure, an electronic board, electronic signature receiving device, a projector, various measuring devices (e.g., water, electricity, gas or electro-magnetic wave measuring devices), and/or the like that include communication functionality.

According to various embodiments of the present disclosure, an electronic device may be any combination of the foregoing devices. In addition, it will be apparent to one having ordinary skill in the art that an electronic device according to various embodiments of the present disclosure is not limited to the foregoing devices.

The term “super resolution” refers to a process of the related art that is performed to enhance an image using a sequence of captured frames of the image abstracted from a video. In the related art, motion within the video sequence, either through panning, or motion of objects within the sequence, is used to introduce non-redundant additional samples of the image. This sequence of images, or samples, is composited into a single static image wherein the non-redundant samples are used to fill in a higher order mosaic of potential sample point locations. From this, a higher resolution image (i.e., an image having super-resolution) may be generated.

As part of the present invention, it has been discovered that when a human being attempts to hold a camera still for a photograph, the user naturally “shakes” or moves the mobile terminal in a random pattern around the desired framing position. An aspect of this invention takes advantage of this natural movement.

FIGS. 1A-1C illustrate the motion of a camera as a user attempts to take a still shot of a stationary object according to an embodiment of the present invention.

Referring to FIG. 1A, a mobile terminal was held in portrait orientation and placed in burst mode to capture a still image. The camera captured 20 shots of the subject of the image over a 1.5 sec duration, the location of each shot denoted by a corresponding numeral in FIG. 1A (i.e., the first shot was obtained while the camera was at location “shot #1”). As evidenced from FIG. 1A, the mobile terminal moved over a horizontal distance of 82 pixels and appeared to have a random motion.

Referring to FIG. 1B, a mobile terminal was held in landscape orientation and placed in burst mode to capture a still image. Again, the camera captured 20 shots of the subject of the image over a 1.5 sec duration, the location of each shot denoted by a corresponding numeral in FIG. 1B (i.e., the first shot was obtained while the camera was at location “shot #1”). As evidenced from FIG. 1B, the mobile terminal moved over a horizontal distance equal to 50 pixels of the camera and again appeared to have a random motion.

Referring to FIG. 1C, a mobile terminal was placed in continuous shooting mode for 5 sec. During that time, the locations of 10 shots of the camera were considered and are illustrated in FIG. 1C. As again evidenced, the motion of the camera appears to be random.

As evidenced by FIGS. 1A-1C, when a user attempts to take a still photo of a subject in burst mode, each successive shot or exposure is slightly offset from the previous shot due to the natural movement of the user. Being offset, the real world image focused upon the image sensor may land in different phases on photo-sites of the camera, thus taking a sample of the image at different points with each exposure. When properly aligned with one another, the multiple offset samples from each exposure effectively form a virtual camera with a larger number of photo-sites, that is, a camera with higher resolution.

FIGS. 2A-2D illustrate a method of improving the resolution of an image for display by overlapping frames of a captured image according to an embodiment of the present disclosure.

Referring to FIGS. 2A-2D, it is assumed that a user is attempting to capture an image of a stationary object that is represented by the signal 201. In FIG. 2A, a first frame (i.e., frame 0) is captured by the digital camera. As is known in the art, the digital camera used to capture the stationary image includes a plurality of photo-sensors that are respectively located at set positions (i.e., photo-sites) within the digital camera. In FIGS. 2A and 2B, the locations of the photo-sites corresponding to the photo-sensors are illustrated by reference numerals 210-1, 210-2, 210-3, 210-4 and 210-5. Of course, it is understood that only five photo-sites are illustrated for sake of brevity and convenience of description and not by way of limitation. As illustrated in FIG. 2A, each photo-sensor located at photo-sites 210-1˜210-5 samples the stationary object signal 201 to produce an output 220-0 corresponding to the stationary object 201 at that location.

In FIG. 2B, a second frame (i.e., frame 1) is captured by the digital camera and each photo-sensor located at photo-sites 210-1˜210-5 again samples the stationary object signal 201. However, in FIG. 2B, while the location of the stationary object remains the same, as evidenced by the stationary object signal 201, the digital camera itself has moved as illustrated by offset 240. In this case, when each photo-sensor located at photo-sites 210-1˜210-5 again samples the stationary object signal 201, the photo-sites produce an output 220-1 corresponding to the stationary object 201, now at the offset location. Here, it is assumed that the offset 240 is less than the pitch of photo-sites within the digital camera.

FIG. 2C illustrates an overlapping of the samples 220-0 from frame 0 and the samples 220-1 from frame 1. As can be seen in FIG. 2C, the overlapped samples 220-0 and 220-1 are illustrated relative to the stationary object signal 201. That is, the overlapped samples 220-0 are offset from corresponding samples 220-1 by offset 240.

In FIG. 2D, the samples 220-0 and 220-1 are averaged and the results are graphed in comparison to the stationary object signal 201. As can be seen in FIG. 2D, the average of samples 220-0 and 220-1 produces a more accurate representation of the stationary object signal 201 as compared to either the samples 220-0 of frame 0 or the samples 220-1 of frame 1, effectively doubling the resolution of the digital camera and producing a higher quality image.
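The overlap of FIG. 2C can be illustrated with a minimal one-dimensional sketch. The sine-shaped `stationary_object` function, the photo-site pitch of 1.0, and the offset of 0.5 are hypothetical values chosen only for illustration; they are not taken from the figures.

```python
import math

def stationary_object(x):
    # Hypothetical stand-in for the stationary object signal 201.
    return math.sin(2.0 * math.pi * x / 3.0)

def sample(signal, pitch, count, offset=0.0):
    """Sample `signal` at `count` photo-sites spaced `pitch` apart, starting at `offset`."""
    return [(offset + i * pitch, signal(offset + i * pitch)) for i in range(count)]

# Frame 0: five photo-sites at their nominal positions (samples 220-0).
frame0 = sample(stationary_object, pitch=1.0, count=5)
# Frame 1: the camera has moved by an assumed offset of half the pitch (samples 220-1).
frame1 = sample(stationary_object, pitch=1.0, count=5, offset=0.5)

# Overlap both frames on a common axis by sorting on sample position.
combined = sorted(frame0 + frame1)
positions = [x for x, _ in combined]
spacings = [b - a for a, b in zip(positions, positions[1:])]

print(positions)  # sample positions of the combined frames
print(spacings)   # spacing between neighbouring samples
```

With an offset of half the pitch, the combined samples are spaced half as far apart as the samples of either frame alone, which is the sense in which the sampling density is doubled.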

FIGS. 3A-3C illustrate a process of capturing and aligning two low resolution images according to an embodiment of the present disclosure.

The example illustrated in FIGS. 3A-3C is substantially similar to that illustrated in FIGS. 2A-2D. However, in FIGS. 3A-3C, an entire image is illustrated and considered, rather than only a few photo-sites.

Referring to FIG. 3A, a first image 301A is captured as part of a succession of image captures. In this example, the first image 301A of FIG. 3A is considered the reference image.

Referring to FIG. 3B, a second image 301B of the same subject is captured as a second in the succession of image captures. As discussed above, due to a natural movement of the user, the second image 301B is offset from the first image 301A as seen by the differences in proximity of the subject of the image to the border of the image in each capture.

Referring to FIG. 3C, when the second image 301B is overlaid on the first image 301A, an offset 310 can be seen. That is, when the second image 301B of FIG. 3B is overlaid and aligned on the first image 301A of FIG. 3A (i.e., the reference image), the images are mismatched by the offset 310.

While only two image captures are illustrated in the examples of FIGS. 2A-2D and FIGS. 3A-3C, it is to be understood that this is merely for convenience and ease of description. In implementation, the disclosure is not so limited and there may be any number of image captures (i.e., frames) to be aligned for high resolution composition. Furthermore, treating the first image captured (e.g., 301A) as the reference image is also only by way of example. In implementation, any of the plurality of captured images may be considered the reference image. Both of these aspects of the present disclosure will be discussed in more detail below.

A first aspect of the present disclosure is to provide an apparatus and method for aligning (i.e., registering) one image relative to another image. In that regard, a progressive approximation method is provided. In more detail, an apparatus and method are provided that use a multi-resolution successive approximation of the alignment. This method has the advantage of being fast as well as robust in the presence of noise and repeating structures in images that may “fool” other alignment methods. In yet another embodiment of the present disclosure, the alignment method includes a morphing routine to correct for changes due to camera angle. This is important for handheld camera “snapshots” that are likely to be taken at fairly close range.

FIG. 4 illustrates a progressive approximation for aligning image captures according to an embodiment of the present disclosure.

Referring to FIG. 4, a first image capture, denoted in FIG. 4 as “Image Capture A” and a second image capture, denoted as “Image Capture B” are shown to illustrate an example of the progressive approximation method of the present disclosure.

In the example of FIG. 4, both the Image Capture A and the Image Capture B are first reduced in scale to 1/64 of their original size, wherein each pixel of each reduced image corresponds to 64 pixels of the original image. After the first size reduction, a rough displacement of Image Capture B in relation to Image Capture A at 1/64 scale can be calculated. The calculated displacement is a rough displacement based on the low resolution of the reduced size images. After the first calculation is completed, the images are again reduced in scale from their original size, this time to 1/32 of their original size. Based on the first calculated displacement, a more precise second displacement can be calculated. This process continues until the images are returned to their original scale, which, in this example, requires six further iterations (seven in total, counting the first iteration at 1/64 scale). Notably, by doubling the scale with every iteration, the accuracy of the offset calculation (i.e., the calculated displacement) is also doubled.
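The sequence of scales in FIG. 4 can be sketched as follows. This is a minimal illustration that assumes a 1/64 reduction means a factor of 64 along each axis and that the reduction is done by block averaging; the text does not specify the reduction filter, and the function names `downscale` and `scale_schedule` are hypothetical.

```python
def downscale(image, factor):
    """Reduce a 2-D list of numbers by an integer `factor` per axis using block averaging.

    Block averaging is an assumption; any low-pass reduction could be substituted.
    """
    height = len(image) // factor
    width = len(image[0]) // factor
    out = []
    for by in range(height):
        row = []
        for bx in range(width):
            total = 0.0
            for y in range(by * factor, (by + 1) * factor):
                for x in range(bx * factor, (bx + 1) * factor):
                    total += image[y][x]
            row.append(total / (factor * factor))
        out.append(row)
    return out

def scale_schedule(coarsest=64):
    """Reduction factors from the coarsest scale to unity: 64, 32, ..., 1."""
    factors = []
    factor = coarsest
    while factor >= 1:
        factors.append(factor)
        factor //= 2
    return factors

# Small usage example on a 4x4 test image reduced by a factor of 2.
image = [[x + 4 * y for x in range(4)] for y in range(4)]
print(downscale(image, 2))
print(scale_schedule())
```

Under these assumptions the schedule contains seven scale levels from 1/64 through unity, each level halving the reduction factor of the previous one.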

FIGS. 5A and 5B illustrate two iterations of a progressive approximation method according to an embodiment of the present disclosure.

Referring to FIG. 5A, a first iteration of a progressive approximation method compares a first image capture (Image A) with a second image capture (Image B) to determine a rough displacement of the second image capture with respect to the first image capture. As illustrated in FIG. 5A, Image A includes a plurality of low resolution pixels 501, as does Image B. To determine the rough displacement in the first iteration, all pixels of Image B are shifted in each of nine possible shift offsets 510 and each shifted image is compared to Image A. Again, based on the reduction in size of Image A and Image B, one pixel 501 corresponds to 64 pixels of the original image. Hence, at that low scale, a shift of one pixel is substantial when considering movement in the original image.

The nine possible shift offsets 510 in Image B include an upper left offset (ul), an up offset (up), an upper right offset (ur), a left offset (lf), a center or no offset (cn), a right offset (rt), a lower left offset (ll), a down offset (dn), and a lower right offset (lr). The first iteration determines which of the nine possible shift offsets 510 will minimize the differences between pixel values of Image A and Image B. In an embodiment of the present disclosure, the difference in pixel values at each offset is determined according to Equation (1).

$\begin{aligned}
cn &= \sum_{\text{All Pixels}} \mathrm{ABS}\left(A - B_{\text{no shift}}\right) \\
up &= \sum_{\text{All Pixels}} \mathrm{ABS}\left(A - B_{\text{shifted up}}\right) \\
ur &= \sum_{\text{All Pixels}} \mathrm{ABS}\left(A - B_{\text{shifted upper right}}\right) \\
rt &= \sum_{\text{All Pixels}} \mathrm{ABS}\left(A - B_{\text{shifted right}}\right) \\
&\;\;\vdots \\
ul &= \sum_{\text{All Pixels}} \mathrm{ABS}\left(A - B_{\text{shifted upper left}}\right)
\end{aligned} \qquad \text{Equation (1)}$

In Equation (1), it is assumed that the offset determination is performed with the full Red, Green, and Blue (RGB) data set when comparing the pixel values of Image A and Image B. However, as an alternative, the algorithms may be run on the green channels only as a method of saving power or processing resources. In that case, the offsets from the green channels can be used to composite all channels of the pixels. The results of each determination of Equation (1) are compared using Equation (2) to determine a value of X_(offset) and Y_(offset).

$\begin{aligned}
&MIN = \min\left(ul, up, ur, lf, cn, rt, ll, dn, lr\right) \\
&\text{if } cn = MIN, \text{ then } X_{offset} = 0,\; Y_{offset} = 0 \\
&\text{if } up = MIN, \text{ then } X_{offset} = 0,\; Y_{offset} = -1 \\
&\text{if } ur = MIN, \text{ then } X_{offset} = 1,\; Y_{offset} = -1 \\
&\text{if } rt = MIN, \text{ then } X_{offset} = 1,\; Y_{offset} = 0 \\
&\;\;\vdots \\
&\text{if } ul = MIN, \text{ then } X_{offset} = -1,\; Y_{offset} = -1
\end{aligned} \qquad \text{Equation (2)}$

Based on the results of Equation (2), a first displacement (i.e., an X_(offset) and a Y_(offset)) of Image B with respect to Image A can be determined. That is, although all pixels of Image B are displaced according to the nine shift offsets, only one of the shifted images is chosen based on the result of Equation (2). Again, the displacement value is based on a scaled down image (e.g., a 1/64^(th) image) such that the displacement value is only a rough approximation. To further refine the determination of the displacement, a second and subsequent iterations are needed. In the example of FIG. 5A, it is assumed that the smallest difference between Image B and Image A occurs when Image B is shifted to the right offset (i.e., rt) such that the X_(offset)=1 and the Y_(offset)=0.
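One iteration of the comparison in Equation (1) and the selection in Equation (2) might be sketched as below. Several points are assumptions made for illustration: images are single-channel 2-D lists, the difference is averaged over the overlapping pixels only (the text sums over all pixels without stating how borders are handled), the y axis grows downward, and the upper left offset is taken as (−1, −1) by symmetry with the up, upper right, and lower left entries. The names `SHIFTS`, `mean_abs_diff`, and `best_shift` are hypothetical.

```python
# Shift names from the text mapped to (X_offset, Y_offset); y grows downward.
SHIFTS = {
    "ul": (-1, -1), "up": (0, -1), "ur": (1, -1),
    "lf": (-1, 0),  "cn": (0, 0),  "rt": (1, 0),
    "ll": (-1, 1),  "dn": (0, 1),  "lr": (1, 1),
}

def mean_abs_diff(a, b, dx, dy):
    """Mean of ABS(A - B shifted by (dx, dy)) over the pixels where both images exist."""
    height, width = len(a), len(a[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            sy, sx = y - dy, x - dx  # pixel of B that lands on (x, y) after the shift
            if 0 <= sy < height and 0 <= sx < width:
                total += abs(a[y][x] - b[sy][sx])
                count += 1
    return total / count if count else float("inf")

def best_shift(a, b, start=(0, 0)):
    """Return the name and offset of the shift, relative to `start`, that best matches A."""
    scores = {
        name: mean_abs_diff(a, b, start[0] + dx, start[1] + dy)
        for name, (dx, dy) in SHIFTS.items()
    }
    name = min(scores, key=scores.get)
    return name, SHIFTS[name]

# Synthetic example: B is built so that shifting it one pixel to the right reproduces A.
W = H = 8
a = [[x * x + 10 * y * y for x in range(W)] for y in range(H)]
b = [[a[y][x + 1] if x + 1 < W else 0 for x in range(W)] for y in range(H)]
print(best_shift(a, b))
```

Averaging over the overlap rather than summing keeps shifts with fewer overlapping pixels from being favoured; a fixed interior comparison window would be another reasonable choice.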

Referring to FIG. 5B, a second iteration of the progressive approximation method is illustrated. In the second iteration, the X_(offset) and Y_(offset) results from the first iteration are used as a starting point and a displacement in each of nine possible shift offsets 530 is again calculated. That is, the second iteration continues from the previous offset location but the resolution coordinates are now doubled to a scale of 1/32. Hence, the previously determined X_(offset) and Y_(offset) based on a 1/64 scaling are doubled to reflect the current 1/32 scaling and used as starting offsets for the second iteration. That is, the starting X_(offset) and Y_(offset) for the second iteration are determined using Equation (3).

X _(offset 1/32) =X _(offset 1/64)*2

Y _(offset 1/32) =Y _(offset 1/64)*2  Equation (3)

Using Equation (3) to determine the starting values for the example of FIG. 5B, it is found that the starting value of X_(offset)=(1)*2=2 and the starting value of Y_(offset)=(0)*2=0. Hence, the Image B is further shifted by one pixel coordinate in each of the nine possible shift offsets 530 from the starting location of (2,0), and Equation (1) and Equation (2) are used to determine which offset of Image B results in a smallest difference with Image A.

In the example of FIG. 5B, it is assumed that shifting Image B to the lower left by one pixel results in the smallest difference between Image A and Image B. According to Equation (2), a shift to the lower left results in an X_(offset)=−1 and a Y_(offset)=1. The offsets determined in the second iteration are added to the offsets determined in the first iteration using the current resolution coordinates. Hence, the X_(offset) and Y_(offset) for the second iteration would be determined using Equation (4).

X _(offset) =X _(offset 1/64)*2+new X _(offset)

Y _(offset) =Y _(offset 1/64)*2+new Y _(offset)  Equation (4)

In the example of FIG. 5B, the result of Equation (4) would be X_(offset)=(1)*2+(−1)=1, and Y_(offset)=(0)*2+(1)=1.

Although only two iterations are illustrated in FIGS. 5A and 5B, it is understood that the remaining iterations would follow substantially the same process. That is, the values of X_(offset) and Y_(offset) determined from the previous iteration and modified by an equation similar to Equation (3) (i.e., multiplying the value by 2) would be used as a starting point for further shifting Image B by one pixel coordinate in each of nine possible shift offsets. Equation (1) and Equation (2) would be used to determine a minimum difference between Image A and Image B based on the nine possible shifts, and an equation similar to Equation (4) (i.e., previous offset*2+new offset) would be used to determine the resultant offsets for the current iteration.
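The bookkeeping of Equations (3) and (4) can be written as a short sketch. The function names `progressive_offset` and `to_native` are hypothetical, and the per-iteration shifts are supplied directly rather than computed from image data.

```python
def progressive_offset(per_iteration_shifts):
    """Fold per-iteration shifts (each component in {-1, 0, 1}) into a running offset.

    Each step doubles the previous offset (Equation (3)) and adds the newly
    selected shift (Equation (4)). Offsets are in pixels of the current scale.
    """
    x_offset, y_offset = 0, 0
    history = []
    for new_x, new_y in per_iteration_shifts:
        x_offset = x_offset * 2 + new_x
        y_offset = y_offset * 2 + new_y
        history.append((x_offset, y_offset))
    return history

def to_native(offset, reduction):
    """Convert an offset at a given reduction factor (e.g. 32 for 1/32 scale) to native pixels."""
    return offset[0] * reduction, offset[1] * reduction

RT, LL = (1, 0), (-1, 1)  # right and lower left, as in FIGS. 5A and 5B
history = progressive_offset([RT, LL])
print(history)                      # [(1, 0), (1, 1)]
print(to_native(history[-1], 32))   # (32, 32)
```

The second entry reproduces the result X_(offset)=1 and Y_(offset)=1 of the FIG. 5B example; expressed in native pixels at 1/32 scale, this corresponds to a displacement of 32 pixels along each axis.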

FIG. 6 illustrates a result of four iterations of a progressive approximation according to an embodiment of the present disclosure.

Referring to FIG. 6, a first iteration results in displacement of the pixel coordinate towards the right offset (i.e., rt), a second iteration results in displacement of the pixel towards the lower left offset (i.e., ll), a third iteration results in displacement of the pixel towards the upper offset (i.e., up) and a fourth iteration again results in displacement of the pixel towards the right offset (i.e., rt). Notably, with each iteration illustrated in FIG. 6, the magnitude of the vector offset decreases. That is, with each iteration, while the image is only displaced by one pixel coordinate, the one pixel coordinate becomes progressively smaller based on the increased scale of the image (e.g., 1/64 to 1/32 to 1/16 to ⅛). Hence, starting with a 1/64 scale, an offset accurate to within 1 pixel of the native resolution can be calculated after only seven iterations. Table 1 describes the relationship between the number of iterations, the scale of the images, and a resultant error of the displacement calculation in pixels.

TABLE 1

Iteration   Scale   Error
1           1/64    Offset calculated to nearest 64 pixels; error = +/−32 pixels
2           1/32    Offset calculated to nearest 32 pixels; error = +/−16 pixels
3           1/16    Offset calculated to nearest 16 pixels; error = +/−8 pixels
4           1/8     Offset calculated to nearest 8 pixels; error = +/−4 pixels
5           1/4     Offset calculated to nearest 4 pixels; error = +/−2 pixels
6           1/2     Offset calculated to nearest 2 pixels; error = +/−1 pixel
7           1       Offset calculated to nearest pixel; error = +/−0.5 pixels
8           2       Offset calculated to nearest 0.5 pixels; error = +/−0.25 pixels
9           4       Offset calculated to nearest 0.25 pixels; error = +/−0.125 pixels
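The entries of Table 1 follow a simple halving pattern, which the following sketch reproduces. The function name `error_table` is hypothetical; exact fractions are used so that the values can be compared with the table directly.

```python
from fractions import Fraction

def error_table(iterations=9, coarsest=64):
    """Return (iteration, scale, nearest, error) rows, with distances in native pixels."""
    rows = []
    for i in range(1, iterations + 1):
        nearest = Fraction(coarsest, 2 ** (i - 1))  # grid spacing of the offset estimate
        scale = 1 / nearest                         # 1/64, 1/32, ..., 1, 2, 4
        rows.append((i, scale, nearest, nearest / 2))
    return rows

rows = error_table()
for i, scale, nearest, error in rows:
    print(f"{i}  scale {scale}  nearest {float(nearest)} px  error +/-{float(error)} px")
```

Each iteration halves both the grid spacing of the estimate and the resulting error bound, so iteration 7 reaches unity scale and iterations 8 and 9 reach one half and one quarter of a pixel.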

To achieve a resultant image having super-resolution, fractional pixel offsets are needed for insertion as samples between pixels of the reference image. As can be seen in Table 1, by the seventh iteration, the algorithm calculates a displacement to the nearest pixel. To achieve the necessary fractional pixel offsets, additional iterations are necessary. In one embodiment, the algorithm is further designed to iterate two more times beyond unity scaling. However, rather than using downscaled image data, both Image A and Image B are ‘up-sampled’ by a factor of two for each iteration beyond unity scaling. This up-sampling is reflected in Table 1.

In another embodiment, an optimization is performed in which Image B is re-sampled by a ½ pixel in all directions and compared to Image A at unity scale for the eighth iteration. For the final iteration, Image B is re-sampled to the nearest ¼ pixel in all directions and again compared to Image A at unity scale.

FIGS. 7A and 7B illustrate an eighth iteration and ninth iteration to determine fractional pixel offsets according to an embodiment of the present disclosure.

Referring to FIG. 7A, a linear interpolation is performed on Image B to re-sample the image by ½ pixel towards the right (i.e., rt). Although not shown, the linear interpolation is performed in each of the remaining 8 directions by ½ pixel. Once each ½ pixel linear interpolation is completed, the resultant re-sampled Image B that most closely matches Image A at unity scale is selected. In the example of FIG. 7A, it is assumed that the ½ pixel linear interpolation towards the right offset (i.e., rt) most closely matches Image A at unity scale.

Referring to FIG. 7B, based on the determination from FIG. 7A that the ½ pixel linear interpolation towards the right offset (i.e., rt) most closely matched Image A at unity scale, Image B is further re-sampled by ¼ pixel linear interpolation in nine possible directions from the right offset. That is, based on the ½ pixel re-sampling 701 that determined the right offset 703 most closely matched Image A at unity scale, a ¼ pixel re-sampling 705 is performed in nine possible offset directions, starting from the right offset of iteration 8, and the closest match to Image A at unity scale determines the final fractional offset.
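The eighth and ninth iterations might be sketched as follows, assuming that Image B is re-sampled by bilinear interpolation (the two-dimensional form of the linear interpolation named in the text) and that the mismatch is averaged over the pixels where the re-sampled image is defined. The function names and the synthetic test images are hypothetical.

```python
import math

def sample_bilinear(image, x, y):
    """Bilinearly interpolate `image` at real-valued (x, y); None outside the image."""
    height, width = len(image), len(image[0])
    if x < 0 or y < 0 or x > width - 1 or y > height - 1:
        return None
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def mismatch(a, b, dx, dy):
    """Mean absolute difference between A and B re-sampled with a shift of (dx, dy)."""
    total, count = 0.0, 0
    for y in range(len(a)):
        for x in range(len(a[0])):
            value = sample_bilinear(b, x - dx, y - dy)
            if value is not None:
                total += abs(a[y][x] - value)
                count += 1
    return total / count if count else float("inf")

def refine(a, b, start, step):
    """Try the nine offsets spaced `step` around `start`; return the best match."""
    candidates = [
        (start[0] + i * step, start[1] + j * step)
        for j in (-1, 0, 1) for i in (-1, 0, 1)
    ]
    return min(candidates, key=lambda off: mismatch(a, b, off[0], off[1]))

# Synthetic scene x*y: B is built so that shifting it right by 0.75 pixel reproduces A.
W = H = 8
a = [[float(x * y) for x in range(W)] for y in range(H)]
b = [[(x + 0.75) * y for x in range(W)] for y in range(H)]

half = refine(a, b, start=(0.0, 0.0), step=0.5)   # eighth iteration, 1/2 pixel
quarter = refine(a, b, start=half, step=0.25)     # ninth iteration, 1/4 pixel
print(half, quarter)
```

In this synthetic case the half-pixel pass settles on the right offset and the quarter-pixel pass then refines it around that position, mirroring the sequence of FIGS. 7A and 7B. Both images stay at unity scale throughout, as in the optimization described above.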

Although two images are illustrated in the above examples beginning with FIG. 4, this is merely for ease of discussion. In implementation, any number of image captures may be used. However, it has been discovered as part of the present disclosure that several variables should be considered when determining the number of image captures to consider. For example, if an increased resolution of four times (i.e., 4×) the original image resolution is desired, it would be necessary to populate a total of 16 frames (i.e., 4× horizontally and 4× vertically). While the reference frame is available to populate one of the 16 frames, the super-resolution algorithm would ideally need 15 new frames, each with unique fractional offsets relative to the reference frame in order to populate all possible fractional states in between the reference samples. To reliably populate all 16 fractional offset states in any given locality a total of 30 image captures would be required. However, populating fewer than 16 fractional offset states still offers significant resolution increases while requiring considerably fewer frame memory and processing resources.

To better determine a number of image captures that should be used to obtain a satisfactory image without imposing a significant computational burden, a combinatorial model was constructed to compute the cumulative probability of landing on a certain number of unique fractional states (referred to as the number of “Hits”) within a certain number of captured frames. Every new frame until the 4th frame reliably offers new resolution information. Thus, four Hits (4=2× resolution) are highly likely with only four frames, and seven captured frames reliably produce six Hits (6=2.4× resolution). To achieve greater than 50% probability of eight Hits (8=2.8× resolution) at every locality of the super-resolution image, one would need to invest at least 10 frames of memory and processing.
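One possible form of such a combinatorial model is sketched below for illustration only. The disclosure does not state the model's assumptions. The sketch assumes that every captured frame lands uniformly and independently on one of the 16 fractional states, and the names hit_distribution and prob_at_least are hypothetical. Probabilities computed under these assumptions may differ from the figures quoted above.

```python
def hit_distribution(frames, states=16):
    """Return a list dist where dist[k] is the probability that exactly k
    distinct fractional states have been hit after `frames` captures.
    Assumes uniform, independent offsets (an assumption of this sketch)."""
    dist = [0.0] * (states + 1)
    dist[0] = 1.0
    for _ in range(frames):
        nxt = [0.0] * (states + 1)
        for k, p in enumerate(dist):
            if p == 0.0:
                continue
            nxt[k] += p * k / states                     # repeats a state already hit
            if k < states:
                nxt[k + 1] += p * (states - k) / states  # lands on a new state
        dist = nxt
    return dist


def prob_at_least(hits, frames, states=16):
    """Cumulative probability of at least `hits` distinct states."""
    return sum(hit_distribution(frames, states)[hits:])
```

For example, prob_at_least(8, 10) gives the probability, under the stated assumptions, that ten captures populate at least eight of the sixteen states at a given locality.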

Based on experimental results, it has been discovered that it is generally efficient to populate half (or less than half) of the available resolution states. Beyond this halfway point, an increasing number of redundancies occur, resulting in progressively less efficient use of available memory and processing resources. Hence, for super-resolution system implementations with tight constraints on frame memory and processing resources, it has been found that 10 or fewer frame captures are sufficient.

It has further been discovered that the number of image captures has an impact on noise reduction when performing the super-resolution algorithm according to embodiments of the present disclosure. More specifically, it has been discovered that super-imposing low-resolution frames has a significant noise reduction benefit that increases in proportion to the square root of the number of frames. Based on experimental results, it has been discovered that when four frames are used for the super-resolution algorithm, noise is reduced to 50% of its original value. When 10 frames are used, noise is reduced to 31% of its original value. And when 25 frames are used, noise is reduced to 20% of its original value. In terms of Signal to Noise Ratio (SNR), SNR is increased 2× with four frames, 3.2× with 10 frames, and 5× with 25 frames. Finally, in terms of SNR gain in decibels (dB), four frames increase SNR by 6 dB, 10 frames increase SNR by 10 dB, and 25 frames increase SNR by 14 dB.
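The square-root relationship can be checked with a short calculation. The sketch below assumes uncorrelated, zero-mean noise of equal strength in every frame and expresses the gain in decibels as 20·log10 of the amplitude ratio. The function name averaging_gain is hypothetical.

```python
import math


def averaging_gain(frames):
    """Return (remaining noise fraction, SNR multiplier, SNR gain in dB)
    when `frames` captures are averaged, assuming uncorrelated noise."""
    snr_multiplier = math.sqrt(frames)
    noise_fraction = 1.0 / snr_multiplier
    snr_gain_db = 20.0 * math.log10(snr_multiplier)
    return noise_fraction, snr_multiplier, snr_gain_db


for n in (4, 10, 25):
    noise, snr, db = averaging_gain(n)
    print(f"{n} frames: noise x{noise:.2f}, SNR x{snr:.2f}, gain {db:.1f} dB")
```

Under these assumptions the calculation agrees with the rounded figures given above for 4, 10, and 25 frames.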

Also for ease of discussion, the above examples beginning in FIG. 4 illustrate a first iteration at a scale of 1/64th of the original image. However, this too is not intended to be limiting. Rather, it has been discovered as part of the present disclosure that a down scale factor for the first iteration of the registration algorithm should match the maximum offset of the frame sequence (to the nearest power of 2). For example, in the illustration of FIG. 1A, the maximum offset of the camera held in the portrait orientation was 82 pixels. In that case, a down scale factor of 64 (i.e., 1/64 scale) may be insufficient, although it would be sufficient for the examples in FIGS. 1B and 1C in which the maximum offsets were 49 pixels and 50 pixels, respectively. Accordingly, while the down scale factor was set to 64 (i.e., scaled to 1/64) in the above examples, it must be understood that this may be easily adjusted based on many factors such as collection of more product-specific user data, available resources and processing costs, and the like.
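A minimal sketch of selecting the first-iteration down scale factor is given below. It interprets "to the nearest power of 2" as rounding up to the next power of two, which is an assumption, although it is consistent with the statement that a factor of 64 may be insufficient for an 82-pixel offset. The function name initial_downscale_factor is hypothetical.

```python
def initial_downscale_factor(max_offset_px):
    """Smallest power of two that is not less than the maximum expected
    frame-to-frame offset, in pixels (rounding up is an assumption)."""
    factor = 1
    while factor < max_offset_px:
        factor *= 2
    return factor
```

With the offsets discussed above, the sketch returns 128 for the 82-pixel case of FIG. 1A and 64 for the 49-pixel and 50-pixel cases of FIGS. 1B and 1C.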

As part of the present disclosure, it was discovered that sequential image captures typically include perspective, scale, rotation, and translation differences caused by random camera tilt along various axes. Thus, an embodiment of the present disclosure provides an image registration algorithm that not only aligns (i.e., registers) sequential images but also transforms them to the same camera perspective.

FIGS. 8A-8C illustrate two image captures requiring additional morphing techniques according to an embodiment of the present disclosure.

Referring to FIGS. 8A-8C, a subject 801 is shown in a first image capture 810 in FIG. 8A, and the same subject 801 is shown in a second, sequential image capture 820 in FIG. 8B. As illustrated in FIG. 8C, the second image capture 820 has a linear distortion with reference to the first image capture 810.

FIGS. 9A-9C illustrate a hybrid global and local registration algorithm for linearly distorted frames according to an embodiment of the present disclosure.

Referring to FIG. 9A, a second image capture 903 is illustrated overlapping a first image capture 901 to determine a rough offset and to define the rough intersection. In FIG. 9B, a registration algorithm (with fractional offsets) is then run again on four small local areas 905 (e.g., 128×128 pixels) at the corners of the rough intersection. Finally, in FIG. 9C, the rough global offset is added to each of the four local offsets to define four offset vectors corresponding to the center point of each corner area 905. These four offset vectors are used to linearly “distort” the second image capture 903, rather than simply translating it, prior to composition of the super-resolution image.
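By way of illustration only, the four offset vectors may be turned into a per-pixel displacement by bilinear blending, as sketched below. The disclosure does not specify the interpolation used for the linear distortion. Bilinear blending, the rectangle convention (x0, y0, x1, y1) spanning the center points of the corner areas, and the names corner_vectors and offset_at are assumptions of this sketch.

```python
def corner_vectors(global_offset, local_offsets):
    """Add the rough global offset to each local corner offset.
    local_offsets maps 'tl', 'tr', 'bl', 'br' to (dx, dy) tuples."""
    gx, gy = global_offset
    return {k: (gx + dx, gy + dy) for k, (dx, dy) in local_offsets.items()}


def offset_at(x, y, rect, corners):
    """Blend the four corner vectors at pixel (x, y).
    rect = (x0, y0, x1, y1) spans the center points of the corner areas."""
    x0, y0, x1, y1 = rect
    u = (x - x0) / (x1 - x0)   # 0 at the left centers, 1 at the right centers
    v = (y - y0) / (y1 - y0)   # 0 at the top centers, 1 at the bottom centers
    top = [(1 - u) * a + u * b for a, b in zip(corners["tl"], corners["tr"])]
    bot = [(1 - u) * a + u * b for a, b in zip(corners["bl"], corners["br"])]
    return tuple((1 - v) * t + v * b for t, b in zip(top, bot))
```

Each pixel of the second image capture would then be re-sampled at its own blended offset, so that the frame is distorted rather than translated by a single vector.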

FIG. 10 illustrates a progressive approximation for aligning image captures according to another embodiment of the present disclosure.

Referring to FIG. 10, a first image capture, denoted in FIG. 10 as “Image Capture A”, and a second image capture, denoted as “Image Capture B”, are shown to illustrate another example of a progressive approximation method of the present disclosure. In the example of FIG. 10, both the Image Capture A and the Image Capture B are first reduced in scale to 1/81 of their original size. In comparison to the embodiment illustrated in FIG. 4, the progressive scaling is now performed in powers of three, rather than powers of two (i.e., 1/64 scale, 1/32 scale, etc.). It has been discovered that a first advantage of scaling in powers of three is that only five iterations are required to reach unity scaling, as compared to seven iterations when scaling by powers of two. Furthermore, when scaling by powers of three, there is less redundancy in each search state. However, it has been found that indexing when scaling by powers of three is slightly more complex than when scaling by powers of two.
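The iteration counts stated above follow directly from the scale schedule, as the following sketch shows. The function name scale_schedule is hypothetical.

```python
def scale_schedule(initial_factor, base):
    """List the down scale factors visited from the first iteration down to
    unity scaling, dividing by `base` at every iteration."""
    factors = [initial_factor]
    while factors[-1] > 1:
        if factors[-1] % base:
            raise ValueError("initial_factor must be a power of base")
        factors.append(factors[-1] // base)
    return factors


print(scale_schedule(64, 2))  # seven entries: 64, 32, 16, 8, 4, 2, 1
print(scale_schedule(81, 3))  # five entries: 81, 27, 9, 3, 1
```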

FIG. 11 is a flowchart illustrating a method for performing a progressive approximation for aligning image captures according to an embodiment of the present disclosure.

Referring to FIG. 11, data of N image captures is read into a memory, such as a buffer memory, for processing in operation 1101. In operation 1103, one of the N image captures is selected as a reference image. As discussed above, each of the other image captures will be compared to the reference image to determine its offset. In an embodiment, a middle image capture of all the image captures (e.g., image capture N/2) may be used as the reference image capture. In operation 1105, a counter X is set to 1 and it is determined in operation 1107 if the counter is less than the value of N. The purpose of the counter in this instance is to ensure that operations 1109 and 1111 are performed for each image capture with the reference image, that is, each image pair. In operation 1109, an offset of each image capture relative to the reference image capture is determined. That is, as explained above, an iterative algorithm is used to determine the offset of the image capture relative to the reference image capture until unity scaling is reached. In operation 1111, a fractional offset of each of four corners of the image pair is determined. The fractional offsets will be used for image morphing. In operation 1113, the counter is increased by 1 and the process returns to operation 1107.

When it is determined in operation 1107 that all image pairs have been considered, the process proceeds to operation 1115 at which the counter is again set to 1. In operation 1117, it is determined if the counter is less than or equal to the number of image captures N. The purpose of the counter in this instance is to ensure that operations 1119 to 1123 are performed for all image captures. In operation 1119, morphing of image capture X is performed according to the four corner offsets determined in operation 1111. In operation 1121, the image capture X is scaled up by a factor of four to achieve a super-resolution for that image capture. In operation 1123, the pixel values for image capture X are added into a sum buffer for later use in determining a final value for each pixel. In operation 1125, the counter is increased by 1 and the process returns to operation 1117.

When it is determined in operation 1117 that all image captures have been considered, the process proceeds to operation 1127 at which the counter is set to the number of pixels. In operation 1129, it is determined if the counter is equal to zero. The purpose of the counter in this instance is to ensure that all pixels of the image are considered. In operation 1131, an average pixel value is determined based on the sum of pixel values determined in operation 1123. In operation 1133, the counter is decreased by 1 and the process returns to operation 1129.

When it is determined in operation 1129 that all pixels have been considered, the process proceeds to operation 1135 in which image sharpening may be performed by an appropriate filter. Notably, operation 1135 is optional.
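For illustration, operations 1119 through 1131 may be sketched as a sum-and-average loop. The sketch is a simplification under stated assumptions: the morphing of operation 1119 is represented by a caller-supplied placeholder function (identity by default), the up scaling of operation 1121 uses nearest-neighbour replication because the disclosure does not name a method, and the names upscale and compose are hypothetical. At least one frame must be supplied.

```python
def upscale(frame, factor=4):
    """Nearest-neighbour up scaling of a frame given as a list of rows
    (the replication method is an assumption of this sketch)."""
    return [[row[x // factor] for x in range(len(row) * factor)]
            for row in frame for _ in range(factor)]


def compose(frames, morph=lambda frame: frame, factor=4):
    """Morph, up scale, and accumulate every frame, then average."""
    sum_buffer = None
    for frame in frames:                       # operations 1117 to 1125
        up = upscale(morph(frame), factor)     # operations 1119 and 1121
        if sum_buffer is None:
            sum_buffer = [[0.0] * len(up[0]) for _ in up]
        for y, row in enumerate(up):           # operation 1123
            for x, value in enumerate(row):
                sum_buffer[y][x] += value
    count = len(frames)
    return [[total / count for total in row] for row in sum_buffer]  # operation 1131
```

An optional sharpening filter, corresponding to operation 1135, could be applied to the returned image.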

FIG. 12 is a block diagram schematically illustrating a configuration of an electronic device according to an embodiment of the present disclosure.

Referring to FIG. 12, an electronic device 1200 may include a control unit 1210, a storage unit 1220, an image processing unit 1230, a display unit 1240, an input unit 1250, a communication unit 1260, and a camera unit 1270.

According to various embodiments of the present disclosure, the electronic device 1200 comprises at least one control unit 1210. The at least one control unit 1210 may be configured to operatively control the electronic device 1200. For example, the at least one control unit 1210 may control operation of the various components or units included in the electronic device 1200. The at least one control unit 1210 may transmit a signal to the various components included in the electronic device 1200 and control a signal flow between internal blocks of the electronic device 1200. The at least one control unit 1210 may be or otherwise include at least one processor. For example, the at least one control unit 1210 may include an Application Processor (AP), and/or the like.

The storage unit 1220 may be configured to store user data, and the like, as well as a program which performs operating functions according to various embodiments of the present disclosure. The storage unit 1220 may include a non-transitory computer-readable storage medium. As an example, the storage unit 1220 may store a program for controlling general operation of the electronic device 1200, an Operating System (OS) which boots the electronic device 1200, and application programs for performing other optional functions such as a camera function, a sound replay function, an image or video replay function, a signal strength measurement function, a route generation function, image processing, and the like. Further, the storage unit 1220 may store user data generated according to a user of the electronic device 1200, such as, for example, a text message, a game file, a music file, a movie file, and the like. According to various embodiments of the present disclosure, the storage unit 1220 may store an application or a plurality of applications that individually or in combination operate the camera unit 1270 to capture (e.g., contemporaneously) one or more images of substantially the same viewpoint, and/or the like. According to various embodiments of the present disclosure, the storage unit 1220 may store an application or a plurality of applications that individually or in combination operate the image processing unit 1230 or the control unit 1210 to perform any of the functions, operations or steps as described above. The storage unit 1220 may store an application or a plurality of applications that individually or in combination operate the control unit 1210 and the communication unit 1260 to communicate with a counterpart electronic device to receive one or more images from the counterpart electronic device, and/or the like. The storage unit 1220 may store an application or a plurality of applications that individually or in combination operate the display unit 1240 to display a graphical user interface, an image, a video, and/or the like.

The display unit 1240 displays information inputted by a user or information to be provided to the user as well as various menus of the electronic device 1200. For example, the display unit 1240 may provide various screens according to the user such as an idle screen, a message writing screen, a calling screen, a route planning screen, and the like. According to various embodiments of the present disclosure, the display unit 1240 may display an interface which the user may manipulate or otherwise enter inputs via a touch screen to enter selection of the function relating to the signal strength of the electronic device 1200. The display unit 1240 can be formed as a Liquid Crystal Display (LCD), an Organic Light Emitting Diode (OLED), an Active Matrix Organic Light Emitting Diode (AMOLED), and the like. However, various embodiments of the present disclosure are not limited to these examples. Further, the display unit 1240 can perform the function of the input unit 1250 if the display unit 1240 is formed as a touch screen.

The input unit 1250 may include input keys and function keys for receiving user input. For example, the input unit 1250 may include input keys and function keys for receiving an input of numbers or various sets of letter information, setting various functions, and controlling functions of the electronic device 1200. For example, the input unit 1250 may include a calling key for requesting a voice call, a video call request key for requesting a video call, a termination key for requesting termination of a voice call or a video call, a volume key for adjusting output volume of an audio signal, a direction key, and the like. In particular, according to various embodiments of the present disclosure, the input unit 1250 may transmit to the at least one control unit 1210 signals related to the operation of the camera unit 1270, to selection of an image, to selection of a viewpoint, and/or the like. Such an input unit 1250 may be formed by one or a combination of input means such as a touch pad, a touchscreen, a button-type key pad, a joystick, a wheel key, and the like.

The communication unit 1260 may be configured for communicating with other electronic devices and/or networks. According to various embodiments of the present disclosure, the communication unit 1260 may be configured to communicate using various communication protocols and various communication transceivers. For example, the communication unit 1260 may be configured to communicate via Bluetooth technology, NFC technology, WiFi technology, 2G technology, 3G technology, LTE technology, or another wireless technology, and/or the like.

The camera unit 1270 may be configured to capture one or a plurality of images and provide the data of the captured one or more images to the control unit 1210 for processing.

FIG. 13 is a block diagram of an applications processor configured to perform a progressive approximation for a super-resolution image according to an embodiment of the present disclosure.

Referring to FIG. 13, an applications processor 1300 may include components of a conventional architecture such as a pre-processor 1310, a Fully Integrated Mobile Display (FIMD) 1320, and a CREO 1330. According to an embodiment of the present disclosure, the applications processor 1300 may further include an Image Signal Processor (ISP) 1340, a super-resolution multi-frame processor 1350, and a frame memory 1360.

The pre-processor 1310 may be configured to receive an input 1301, such as data from an image sensor or digital camera. After processing the received data (e.g., lens shading, addressing flicker, etc.), the pre-processor 1310 provides the pre-processed data to the ISP 1340. The ISP 1340 performs additional functions on the data such as conversion from RGB format to YCbCr format, white balancing, color saturation enhancement, and the like. Moreover, the ISP 1340 provides the received data to the super-resolution multi-frame processor 1350 for performing any or all of the functions as described above. Based on the additional memory needs of the super-resolution multi-frame processor 1350, a frame memory 1360 may also be provided. Upon completion of the super-resolution processing, the super-resolution multi-frame processor 1350 provides an output 1305 including an image having an enhanced resolution. The output may be provided to an external storage, a display unit, and the like. In a normal mode, the ISP 1340 provides an output for further processing to the FIMD 1320 and the CREO 1330, which may ultimately provide an output signal to a display unit, such as the display unit 1240 of FIG. 12.

In an alternative embodiment, the pre-processor 1310 may output raw data directly to the super-resolution multi-frame processor 1350. In that case, the super-resolution multi-frame processor 1350 may output super raw data to the ISP 1340. However, this option requires the ISP 1340 to operate on a much larger data set.

It will be appreciated that various embodiments of the present disclosure according to the claims and description in the specification can be realized in the form of hardware, software or a combination of hardware and software.

Any such software may be stored in a non-transitory computer readable storage medium. The non-transitory computer readable storage medium stores one or more programs (software modules), the one or more programs comprising instructions, which when executed by one or more processors in an electronic device, cause the electronic device to perform a method of the present disclosure.

Any such software may be stored in the form of volatile or non-volatile storage such as, for example, a storage device like a Read Only Memory (ROM), whether erasable or rewritable or not, or in the form of memory such as, for example, Random Access Memory (RAM), memory chips, device or integrated circuits or on an optically or magnetically readable medium such as, for example, a Compact Disk (CD), Digital Versatile Disc (DVD), magnetic disk or magnetic tape or the like. It will be appreciated that the storage devices and storage media are various embodiments of non-transitory machine-readable storage that are suitable for storing a program or programs comprising instructions that, when executed, implement various embodiments of the present disclosure. Accordingly, various embodiments provide a program comprising code for implementing apparatus or a method as claimed in any one of the claims of this specification and a non-transitory machine-readable storage storing such a program.

While the disclosure has been shown and described with reference to various embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims and their equivalents. Various embodiments of the present disclosure are described as examples only and are not intended to limit the scope of the present disclosure. Accordingly, the scope of the present disclosure should be understood to include any and all modifications that may be made without departing from the technical spirit of the present disclosure.

What is claimed is:
 1. A method for increasing the resolution of an image, the method comprising: capturing a plurality of frames of an image; determining a reference frame from among the plurality of frames; iteratively determining an offset of each of the plurality of frames to the reference frame until unity scaling is reached; and determining a pixel value for insertion between pixels of the reference frame.
 2. The method of claim 1, wherein the iteratively determining the offset of each of the plurality of frames comprises down scaling each of the plurality of frames and the reference frame.
 3. The method of claim 2, wherein the down scaling is performed by powers of two.
 4. The method of claim 2, wherein the down scaling is performed by powers of three.
 5. The method of claim 1, wherein the determining of the reference frame comprises determining the number N of the plurality of frames and selecting the N/2 frame as the reference frame.
 6. The method of claim 1, wherein the determining of the pixel value for insertion comprises up scaling each of the plurality of frames and the reference frame.
 7. The method of claim 1, wherein the determining of the pixel value for insertion comprises re-sampling each of the plurality of images by linear interpolation and comparing the re-sampled image to the reference image at unity scaling.
 8. The method of claim 7, wherein the re-sampling is performed by at least one of ½ pixel and ¼ pixel.
 9. The method of claim 8, further comprising summing the values of each pixel for each of the plurality of frames and determining an average value for each pixel.
 10. The method of claim 1, further comprising distorting each of the plurality of frames in relation to the reference frame.
 11. The method of claim 10, wherein the distorting of each of the plurality of frames comprises determining a fractional offset at each corner of an intersection of the frame and the reference frame.
 12. An apparatus for increasing the resolution of an image, the apparatus comprising: a camera unit configured to capture a plurality of frames of an image; and a control unit configured to determine a reference frame from among the plurality of frames, to iteratively determine an offset of each of the plurality of frames to the reference frame until unity scaling is reached, and to determine a pixel value for insertion between pixels of the reference frame.
 13. The apparatus of claim 12, wherein the control unit is configured to iteratively determine the offset of each of the plurality of frames by down scaling each of the plurality of frames and the reference frame.
 14. The apparatus of claim 13, wherein the down scaling is performed by powers of two.
 15. The apparatus of claim 13, wherein the down scaling is performed by powers of three.
 16. The apparatus of claim 12, wherein the control unit is configured to determine the reference frame by determining the number N of the plurality of frames and selecting the N/2 frame as the reference frame.
 17. The apparatus of claim 12, wherein the control unit is configured to determine the pixel value for insertion by up scaling each of the plurality of frames and the reference frame.
 18. The apparatus of claim 12, wherein the control unit is configured to determine the pixel value for insertion by re-sampling each of the plurality of images by linear interpolation and comparing the re-sampled image to the reference image at unity scaling.
 19. The apparatus of claim 18, wherein the re-sampling is performed by at least one of ½ pixel and ¼ pixel.
 20. The apparatus of claim 19, wherein the control unit is further configured to sum the values of each pixel for each of the plurality of frames and determine an average value for each pixel.
 21. The apparatus of claim 12, wherein the control unit is further configured to distort each of the plurality of frames in relation to the reference frame.
 22. The apparatus of claim 21, wherein the control unit is configured to distort each of the plurality of frames by determining a fractional offset at each corner of an intersection of the frame and the reference frame. 